ABSTRACT 

The rpoH genes encoding homologs of Escherichia 
coli σ32 (heat shock σ factor) were isolated and 
sequenced from five gram negative proteobacteria (γ 
or α subgroup): Enterobacter cloacae (γ), Serratia 
marcescens (γ), Proteus mirabilis (γ), Agrobacterium 
tumefaciens (α) and Zymomonas mobilis (α). Comparison 
of these and three known genes from E.coli (γ), 
Citrobacter freundii (γ) and Pseudomonas aeruginosa 
(γ) revealed marked similarities that should reflect 
conserved function and regulation of σ32 in the heat 
shock response. Both the sequence complementary to 
part of 16S rRNA (the 'downstream box') and a 
predicted mRNA secondary structure similar to those 
involved in translational control of σ32 in E.coli were 
found for the rpoH genes from the γ, but not the α, 
subgroup, despite considerable divergence in nucleotide 
sequence. Moreover, a stretch of nine amino acid 
residues, Q(R/K)(K/R)LFFNLR, designated the 'RpoH 
box', was absolutely conserved among all σ32 homologs, 
but absent in other σ factors; this sequence 
overlapped with the segment of polypeptide thought to 
be involved in DnaK/DnaJ chaperone-mediated negative 
control of synthesis and stability of σ32. In 
addition, a putative σE (σ24)-specific promoter was 
found in front of all rpoH genes from the γ, but not α, 
subgroup. These results suggest that the regulatory 
mechanisms, as well as the function, of the heat shock 
response known in E.coli are very well conserved 
among the γ subgroup and partially conserved among 
the α proteobacteria. 

INTRODUCTION 

Induction of heat shock proteins under heat or other stress 
conditions represents a ubiquitous cellular response that occurs at 
the level of transcription (1). In E.coli the rpoH (htpR, hin) gene 
encoding a minor σ factor of 32 kDa (σ32) plays a major role in 
this homeostatic process (2-4). σ32 is not only required for 
recognizing promoters for transcription of heat shock genes, but 
also serves as the central regulatory factor (5,6). During steady-state 
growth the level of σ32 is kept very low (10-30 molecules/cell), 
primarily because of limited translation of rpoH mRNA and the 
unusual instability of σ32. Upon a shift to higher temperature the 
σ32 level increases rapidly, though transiently, by both increased 
synthesis and stabilization. This transient increase in the σ32 level 
essentially accounts for induction of heat shock proteins under 
stress conditions (5,6). 

Extensive analyses of heat-induced synthesis of a σ32-β-galactosidase 
fusion protein from a rpoH-lacZ gene fusion revealed 
that two proximal coding regions (A and B) of rpoH mRNA are 
involved in translational repression during steady-state growth 
and marked induction upon temperature upshift, primarily by 
modulating the stability of the mRNA secondary structure (7-9). 
Whereas region A (nt 6-20) represents a translational enhancer 
('downstream box'; 10) which is complementary to 16S rRNA 
(nt 1469-1483), region B (nt 110-210) contains a sequence that 
forms an internal mRNA secondary structure involving the 
initiation codon and downstream box and thus serves to modulate 
efficiency of translation (8,9). Furthermore, a segment of the σ32 
polypeptide (region C; around residues 122-144), which is 
encoded further downstream in rpoH, was found to be important 
for DnaK/DnaJ chaperone-mediated negative feedback control of 
the heat shock response by modulating synthesis and stability of 
the fusion protein (11,12). 

On the other hand, no evidence for involvement of σ32-like 
factors in heat shock gene expression has been found with gram 
positive bacteria, such as Bacillus subtilis (13) and Clostridium 
acetobutylicum (14). Instead, a highly conserved inverted repeat 
sequence was detected in front of some of the major heat shock 
genes (dnaK and groE), which appeared to play a negative 
regulatory role (13-15). These interesting developments, as well 
as the unique regulatory features associated with different segments 
of rpoH mRNA and σ32 in E.coli, prompted us to examine 
rpoH homologs from other bacteria at various phylogenetic 
distances. 

The only sequence of a rpoH homolog known when this work 
was initiated was that of Citrobacter freundii, closely related to 
E.coli (16). The rpoH gene of the more distantly related Pseudomonas 
aeruginosa was recently isolated and characterized (17,18). 
We have cloned and sequenced the genes from several other 
bacteria, including some that are even more distantly related. The 

[Figure 1: alignment of the deduced amino acid sequences of the eight RpoH homologs (Eco, Cfr, Ecl, Sma, Pmi, Pae, Atu, Zmo), with the generally conserved σ factor regions marked below the sequences.]

Figure 1. Alignment of deduced amino acid sequences among RpoH homologs. Multiple alignment was carried out with CLUSTAL W (23) using sequences of eight 
RpoH homologs from E.coli (Eco; A94012), C.freundii (Cfr; S04697), E.cloacae (Ecl), S.marcescens (Sma), P.mirabilis (Pmi), P.aeruginosa (Pae; U09560), 
A.tumefaciens (Atu) and Z.mobilis (Zmo). Numbers below the sequences show generally conserved regions for σ factors according to Lonetto et al. (32). Asterisks 
and dots indicate completely or partially conserved residues, respectively. The shaded area refers to the 'RpoH box' (see text). 



selection employed was based on the ability of rpoH homologs 
to functionally complement temperature-sensitive growth of the 
E.coli ΔrpoH mutant lacking σ32, which can grow only at or 
below 20°C (19). Compilation of the resulting sequence data 
permitted analysis of eight RpoH homologs (σ32-like proteins) 
altogether from diverse gram negative bacteria. 

MATERIALS AND METHODS 

Bacteria, phage, plasmids and DNAs 

Escherichia coli strain KY1608 (MC4100 ΔrpoH30::kan 
zhf50::Tn10), a non-lysogenic version of KY1612 described 
previously (19), was used as the host for initial screening of rpoH 
homologs. Phage λpF13-(groEp-lacZ) carrying lacZ under 
control of the groE heat shock promoter has been previously 
described (20). Other phage and plasmid vectors were obtained 
from commercial sources. The bacteria used as DNA sources 
were: Serratia marcescens ATCC264, Proteus mirabilis PM-1, 
Enterobacter cloacae (identified by Dr Akira Yokota, Fermentation 
Institute, Osaka), Agrobacterium tumefaciens IAM 12544 (a 
gift of Drs Kan Tanaka and Hideo Takahashi, University of 
Tokyo) and Zymomonas mobilis CP4 (a gift of Dr L.O.Ingram, 
University of Florida). 

Cloning and sequencing of genes 

Cells of strain KY1608 were transfected with a DNA library 
(>10^4 clones) from each donor bacterium, using charomid 9-36 or 
λgt11 as a cloning vector. Ampicillin-resistant colonies were 
selected at 30°C, the transformants obtained were lysogenized 
with λpF13-(groEp-lacZ) and those that exhibited strong red (or 
blue) color on MacConkey lactose agar (or L agar containing 
X-gal) were further analyzed. The recombinant DNAs containing 
a putative rpoH homolog were subcloned as 2-4 kb fragments by 
digesting with restriction endonucleases or exonucleases, followed 
by insertion into plasmid pUC118. Stepwise deletion derivatives 
of each clone were constructed and the nucleotide sequence was 
determined with a DNA sequencer (Applied Biosystems). A 
search for sequence homology was carried out against nucleotide 
and peptide sequence databases using a BLAST e-mail server 
(21,22). An open reading frame with a sequence closely related to 
E.coli rpoH was thus detected. 

Multiple sequence alignment and phylogenetic analysis 

Multiple alignment of sequences was carried out with CLUSTAL 
W (23) or ICOT Free Software (24), followed by minimal 
manual modification. Phylogenetic analysis was conducted with 
the computer programs PROTDIST, NEIGHBOR, PROTPARS, 
SEQBOOT and CONSENSE, all in the PHYLIP package, 
version 3.5c (25). Phylogenetic trees were visualized with 
DRAWGRAM, provided in the same package. 
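
The bootstrap step of such an analysis resamples alignment columns with replacement before each tree is rebuilt. The following is only a minimal sketch of that resampling idea, not the SEQBOOT implementation; the function name, the toy alignment and the fixed seed are all assumptions made for illustration.

```python
import random

def bootstrap_columns(alignment, rng):
    """Return one bootstrap replicate: alignment columns resampled with replacement."""
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[n]) != length for n in names):
        raise ValueError("all aligned sequences must have the same length")
    # Draw as many column indices as the alignment is long, with replacement.
    picks = [rng.randrange(length) for _ in range(length)]
    return {n: "".join(alignment[n][i] for i in picks) for n in names}

# Hypothetical toy alignment, not sequence data from this paper.
toy_alignment = {"taxonA": "MTKEMQ", "taxonB": "MTQEMQ", "taxonC": "MARNSL"}
rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_columns(toy_alignment, rng) for _ in range(1000)]
```

Each replicate would then be passed to a distance or parsimony method, and the resulting trees summarized as a consensus.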

Prediction of RNA secondary structure 

Potential mRNA secondary structures for the 5'-portion of the 
rpoH coding region were predicted using MFOLD (26) and the 
resulting structures visualized with LOOPVIEWER (27). 

Accession number for nucleotide sequences 

The nucleotide sequence data reported in this paper will appear 
in the GSDB, DDBJ, EMBL and NCBI nucleotide sequence 
databases with the following accession numbers: E.cloacae, 
D50829; S.marcescens, D50831; P.mirabilis, D50830; 
A.tumefaciens, D50828; Z.mobilis, D50832. 

RESULTS 

Cloning and sequencing of rpoH homologs 

Infection of the E.coli mutant lacking σ32 (KY1608) with a 
charomid (or λgt11) library of donor bacterial DNA followed by 
lysogenization and screening for increased lacZ expression, as 
described in Materials and Methods, led us to identify a rpoH 
gene homolog from each of the five gram negative bacteria 
belonging to the γ (E.cloacae, S.marcescens and P.mirabilis) or 
α (A.tumefaciens and Z.mobilis) subgroups of the proteobacteria. 
The putative rpoH homologs thus obtained contained sequences 
that are closely related to that of E.coli rpoH for the entire coding 
region (see below). As expected from the selection employed, all 
the rpoH homologs supported growth of KY1608 at 30°C or 
higher temperatures and permitted heat-induced synthesis of 
major heat shock proteins. Furthermore, some of them (those 
from the γ, but not the α, subgroup) caused synthesis of RpoH 
proteins that can cross-react with antiserum against E.coli σ32. 
Details of the cloning and expression studies will be presented 
elsewhere (K.Nakahigashi et al., unpublished). The isolation and 

Table 1. Overall structure comparison of the RpoH homologs 



characterization of the rpoH homolog from P.aeruginosa was 
recently reported (17,18). 

Overall structural similarity of RpoH homologs 

The amino acid sequences predicted from nucleotide sequences 
of eight rpoH homologs, including the five new entries, were 
aligned to analyze their structures (Fig. 1). The RpoH homologs 
from the γ subgroup (the upper six genes) were found to be closely 
related: they showed an almost identical number of amino acid 
residues (284 or 285) and 60% or higher sequence homology 
(identity), containing only a few deletions (or additions) of single 
residues (Fig. 1 and Table 1). This overall similarity in amino acid 
sequence, despite appreciable differences in nucleotide composition 
(the GC content varying between 43.1 and 62.7%), was 
striking. In contrast, the two RpoH homologs from the α subgroup 
(the lower two genes) were significantly larger (300-302 residues) 
and contained more extensive deletions or additions, as well as 
substitutions, thus exhibiting <40% sequence identity with that of 
E.coli. These results are consistent with the known phylogenetic 
distance between the α and γ proteobacteria (29). 
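
As a rough illustration of the two quantities compared here, percent identity and GC content can be computed from aligned sequences as sketched below. This is not the ALIGN program used for Table 1; in particular, the choice to exclude gapped positions from the denominator is an assumption, and alignment programs differ on this point. The function names are hypothetical.

```python
def percent_identity(a, b, gap="-"):
    """Percent identical residues over aligned positions where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)

def gc_content(seq):
    """Percent of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)
```

For example, `percent_identity("MTKE-MQ", "MTQEAMQ")` compares six ungapped positions, five of which match.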

Amino acid sequence comparisons of bacterial σ factors 
previously permitted identification of four major regions of 
homology, 1-4, that were further divided into subregions (see 
30-32). We thus calculated similarity scores along the entire 
sequence of the RpoH homologs (data not shown). As might have 
been expected, regions 2.1-2.4 and 4.2 exhibited particularly 
high conservation. However, an additional region of similarity 
appeared to exist between regions 2.4 and 3.1. 

A segment(s) uniquely conserved among RpoH homologs 

We then compared the above sequence data with those of other σ 
factors, such as RpoD homologs (primary σ factors, σ70 in gram 
negative bacteria). Inspection of the data suggested that a segment 
highly conserved among RpoH homologs, but not among the 
other σ factors, coincides in part with the non-conserved region 
flanked by regions 2.4 and 3.1. To further substantiate this 
possibility we compared sequence similarity among RpoH, RpoD 
and both RpoH and RpoD combined for the entire region 
spanning from 2.1 to 4.2, using the pair of genes from three 
representative bacteria, E.coli, P.aeruginosa and A.tumefaciens 
(Fig. 2). 





                 rpoH gene                                      σ32 homolog
                 No. of        Percent   Percent identity       No. of        Mol. wt   Percent identity
                 nucleotides   GC        (vs E.coli)            amino acids   (kDa)     (vs E.coli)

E.coli           852           54.3      100                    284           32.5      100
C.freundii       852           52.9      89.9                   284           32.6      94.4
E.cloacae        855           54.6      86.5                   285           32.7      92.3
S.marcescens     855           57.8      80.7                   285           32.7      84.9
P.mirabilis      852           43.1      72.8                   284           32.6      80.3
P.aeruginosa     852           62.7      66.4                   284           32.6      60.7
A.tumefaciens    900           60.4      45.4                   300           34.4      35.6
Z.mobilis        906           49.4      46.4                   302           34.5      38.6

All the numbers and values refer to the coding sequences. Percent identity was based on a comparison with E.coli rpoH or σ32. Pairwise alignments of either 
nucleotide or amino acid sequences were carried out using the ALIGN 0 program, version 17 (28). 
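
The nucleotide and amino acid counts in Table 1 can be cross-checked against each other. The sketch below assumes, as an inference from the numbers rather than a statement in the text, that the nucleotide counts exclude the stop codon, so that each coding sequence should be exactly three times the number of residues.

```python
# (nucleotides, amino acids) for each homolog, copied from Table 1.
table1 = {
    "E.coli": (852, 284),
    "C.freundii": (852, 284),
    "E.cloacae": (855, 285),
    "S.marcescens": (855, 285),
    "P.mirabilis": (852, 284),
    "P.aeruginosa": (852, 284),
    "A.tumefaciens": (900, 300),
    "Z.mobilis": (906, 302),
}

# Collect any entry whose coding length is not 3 nt per residue.
mismatches = [name for name, (nt, aa) in table1.items() if nt != 3 * aa]
print(mismatches)  # prints [] because every entry satisfies nt == 3 * aa
```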
Figure 2. Sequence similarity among the RpoH and RpoD families of σ factors. The sequence data of Figure 1 (RpoH homologs) and that of the RpoD family (E.coli 
RpoD, A00699; E.coli RpoS/KatF, S14901; P.aeruginosa RpoD, S15900; A.tumefaciens SigA, A36913; M.xanthus SigA, M32347; B.subtilis RpoD, A22626; 
P.aeruginosa RpoS, D26134; Lactococcus lactis RpoD, JC1397) were aligned with ICOT Free Software (24). Three representative bacteria in which both RpoD and 
RpoH sequences are known (E.coli, P.aeruginosa and A.tumefaciens) were then used to calculate similarity scores using PLOTSIMILARITY (in WISCONSIN 
PACKAGE version 8, Genetics Computer Group Inc.). Thick and thin lines show similarity within the RpoH or RpoD families, respectively, whereas the dotted line 
shows similarity with both RpoH and RpoD sequences combined. The scores have been normalized to the average value (0.0) for each plot (0.8, 1.2 and 0.7 for the 
above three lines, respectively). A plateau level for the conserved 2.1-2.4 region of RpoD represents the maximum value obtained (perfect match). A narrow region 
between 2.4 and the RpoH box with apparently low similarity for RpoH is due to an artifactual gap introduced during co-alignment of the RpoH and RpoD sequences. 
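
The idea of a similarity profile normalized to its own average can be sketched as follows. This is not the PLOTSIMILARITY algorithm: the per-column score used here (fraction of identical residue pairs), the window size and the function names are assumptions chosen only to show the windowing and mean-centering steps.

```python
from itertools import combinations

def column_scores(seqs):
    """Fraction of identical, non-gap residue pairs in each alignment column."""
    scores = []
    for col in zip(*seqs):
        pairs = list(combinations(col, 2))
        scores.append(sum(x == y and x != "-" for x, y in pairs) / len(pairs))
    return scores

def windowed_similarity(seqs, window=3):
    """Sliding-window average of column scores, centered on the profile mean (0.0)."""
    col = column_scores(seqs)
    smooth = [sum(col[i:i + window]) / window for i in range(len(col) - window + 1)]
    mean = sum(smooth) / len(smooth)
    return [s - mean for s in smooth]
```

Positive values then mark stretches that are more conserved than the average of the plotted region, and negative values mark less conserved stretches.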



In the case of RpoD the very highly conserved region 2.1-2.4 
was immediately followed by a region of much lower similarity. 
With RpoH, however, the conserved region 2.4 was followed by 
another region of high similarity, comparable with that of the 
preceding region. In particular, a stretch of nine amino acids, 
Q(R/K)(K/R)LFFNLR (residues 132-140 for E.coli), was the 
most highly conserved in the entire sequence and was tentatively 
designated the 'RpoH box'. In contrast, pairwise comparisons of 
the nucleotide sequence for the same segment revealed appreciable 
variation between different species (up to eight changes in 27 
nucleotides; not shown). Furthermore, similarities for region 2.4 
and the RpoH box found with the combined RpoD and RpoH 
sequences were much lower than those obtained with RpoH (or 
RpoD) alone, indicating that these two regions represent unique 
and characteristic features of RpoH homologs. Comparisons 
between RpoH and other σ factors also gave results that are 
consistent with the notion that the RpoH box represents a 
sequence unique to this family of σ factors (not shown). 
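
The RpoH box consensus Q(R/K)(K/R)LFFNLR translates directly into a regular expression. The sketch below shows one way to scan a protein sequence for it; the function name and the test strings are hypothetical and are not sequences from this paper.

```python
import re

# Q, then R or K, then K or R, then the invariant LFFNLR.
RPOH_BOX = re.compile(r"Q[RK][KR]LFFNLR")

def find_rpoh_box(protein):
    """Return (1-based start position, matched segment) for each RpoH box found."""
    return [(m.start() + 1, m.group()) for m in RPOH_BOX.finditer(protein.upper())]

# Hypothetical test strings.
print(find_rpoh_box("AAAQRKLFFNLRGG"))  # [(4, 'QRKLFFNLR')]
print(find_rpoh_box("QAKLFFNLR"))       # []
```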

Most significantly, the RpoH box overlapped with region C 
(around residues 122-144), which was previously shown to be 
involved in DnaK/DnaJ-mediated translational shut-off of σ32 
synthesis during the adaptation phase and in the characteristic 
instability of σ32, on the basis of observations with deletion and 
frame-shift derivatives of a rpoH-lacZ gene fusion (11). Thus 
region C, which contains at least part of region 2.4 and perhaps 
the entire RpoH box, is very well conserved among all the RpoH 
homologs examined (see Fig. 1). 

Conserved regulatory elements for translational control 

A sequence similar to the downstream box, which is complementary 
to 16S rRNA, was found immediately following the initiation 
codon of rpoH mRNAs for all the γ proteobacteria analyzed 
(Fig. 3). The complementarity was as high as that previously 
found in E.coli (65-80% matching), strongly suggesting that 
these sequences play an active role in enhancing translation of the 



[Figure 3: 5'-portions of the rpoH coding sequences (Eco, Cfr, Ecl, Sma, Pmi, Pae; upper sequences) aligned with the complementary part of 16S rRNA (lower sequences, written 3' to 5').]


Figure 3. The 5'-portions (nt 1-24) of coding sequence containing the putative 
'downstream box' of each rpoH homolog (upper sequence) are shown to be 
complementary to part of 16S rRNA ('anti-downstream box', spanning nt 
1469-1483 in E.coli, close to the 3'-end; lower sequence) of the respective 
bacteria. The sequence data used, besides those listed in the legend to Figure 1, 
were: Eco rpoH, X04398; Eco 16S rRNA (see 10); Cfr rpoH, X14960; Cfr 16S 
rRNA, M59291; Sma 16S rRNA, M59160; Pvu 16S rRNA, X07652; Pae 16S 
rRNA, M34133. The anti-downstream box of P.aeruginosa is shifted by eight 
bases toward its 5'-end, as compared with that of the others. Because the 16S 
rRNA sequences for E.cloacae and P.mirabilis are not known, those from close 
relatives (E.coli and Proteus vulgaris, respectively) were used. 



respective rpoH mRNA in these bacteria, in addition to the 
similar role played by the Shine-Dalgarno sequence. It should be 
noted that the anti-downstream box that matches P.aeruginosa 
rpoH was slightly shifted from those of the rest of the bacteria 
examined. This probably explains the previous failure to detect 
such a sequence (18). No similar sequence was found with the 
A.tumefaciens gene, and the lack of 16S rRNA sequence data 
prevented analysis of the Z.mobilis gene. 
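
A percent-matching figure of this kind can be obtained by counting base pairs between the mRNA segment and the rRNA segment written in the opposite orientation. The sketch below is one possible way to do this; whether G-U wobble pairs were counted in the original analysis is not stated, so that is exposed as an option, and the function name is hypothetical.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}

def percent_complementarity(mrna_5to3, rrna_3to5, allow_gu=False):
    """Percent of positions that can base-pair when the two strands are antiparallel.

    mrna_5to3 is read 5'->3' and rrna_3to5 is the rRNA segment written 3'->5',
    so that position i of one string faces position i of the other.
    """
    if len(mrna_5to3) != len(rrna_3to5):
        raise ValueError("segments must have the same length")
    allowed = WATSON_CRICK | WOBBLE if allow_gu else WATSON_CRICK
    hits = sum((m, r) in allowed for m, r in zip(mrna_5to3.upper(), rrna_3to5.upper()))
    return 100.0 * hits / len(mrna_5to3)
```

For a 15 nt downstream box, 10 to 12 pairing positions would correspond to roughly the 65-80% range quoted in the text.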

We then examined potential mRNA secondary structures for 
the 5'-proximal portion (nt -20 to 210) of the rpoH coding 
sequence, which could be involved in thermal regulation of its 
translation. The predicted mRNA secondary structures for all the 

Figure 4. Possible secondary structures for the 5'-portion of mRNA (nt -20 to 210) of rpoH homologs from γ proteobacteria. The structures shown were among the 
most stable predicted by MULFOLD (26) and visualized by LOOPVIEWER (27), with one exception, P.mirabilis. The minimum energies (ΔG, kcal/mol) were 
calculated to be -60.9 (Eco), -59.4 (Cfr), -51.5 (Ecl), -55.1 (Sma), -53.5 (Pmi) and -70.9 (Pae), which should be compared with the values indicated for each of the 
structures shown. The initiation codon is shaded and the conserved base pairings are boxed (see text). 



γ proteobacteria were surprisingly similar to that reported for 
E.coli; the structures presented in Figure 4 were found among 
the most stable structures for each of the rpoH homologs examined 
(except for P.mirabilis); the three major stems thought to play 
unique regulatory roles (9) were well conserved. The similarity 
extended further, apparently to the two most critical adjacent 

[Figure 5: upstream sequences of the rpoH homologs from Eco, Cfr, Ecl, Sma, Pmi and Pae, showing the predicted -35 and -10 promoter elements and the SD sequence. Nucleotides between the promoter region and the SD sequence: Eco 76, Cfr 92, Ecl 89, Sma 121, Pmi 115, Pae 30.]



Figure 5. Predicted σE promoters found in the upstream region of rpoH 
homologs are shown along with that known in E.coli (see 6) and that predicted 
for P.aeruginosa (17,18). SD, Shine-Dalgarno sequence. The numbers refer to 
nucleotides that should be inserted between the promoter region and SD 
sequence. 



G-C pairings involved in thermal regulation of E.coli rpoH (nt 
15::124 and 16::123): they were perfectly conserved among all 
the mRNA structures shown (shaded areas in Fig. 4). These 
results, along with the recent report on P.aeruginosa rpoH (18), 
provide strong evidence in support of the notion that mRNA 
secondary structure plays an important role in the control of rpoH 
translation in E.coli and, by inference, in other members of the γ 
proteobacteria examined. 
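
The notation nt 15::124 denotes a pairing between positions 15 and 124 of the coding sequence. A minimal sketch of checking whether two such positions could form a Watson-Crick pair is given below; it assumes 1-based numbering from the first base of the initiation codon, uses a made-up toy sequence, and does not predict whether the pair actually forms in a folded structure.

```python
CAN_PAIR = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}

def can_pair(seq, i, j):
    """True if the bases at 1-based positions i and j are Watson-Crick complementary."""
    return (seq[i - 1].upper(), seq[j - 1].upper()) in CAN_PAIR

# Hypothetical toy RNA: G at positions 15-16 and C at positions 123-124.
toy_rna = "A" * 14 + "GG" + "A" * 106 + "CC" + "A" * 10

print(can_pair(toy_rna, 15, 124), can_pair(toy_rna, 16, 123))  # True True
```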

Comparison of flanking nucleotide sequences 

Among the known promoters of E.coli rpoH, three are transcribed 
by σ70 RNA polymerase (Eσ70) and one by σE RNA polymerase 
(EσE) (12). A search for putative promoters with the rpoH 
homologs led us to find a putative σE promoter in front of each 
of the rpoHs of all the γ proteobacteria (Fig. 5). In contrast, we 
failed to identify any promoter(s) similar to the 'σ70-consensus', 
although at least one such promoter is probably active. In 
P.aeruginosa two major rpoH transcripts were reported to be 
produced when the gene was expressed in E.coli upon shift to 
50°C, though the start site for a putative σE-dependent transcript 
did not quite agree with the predicted location of the promoter 
(17,18). On the basis of sequence data alone we failed to detect 
either σ70-like or σE-like promoters upstream of the rpoH 
homologs from the α subgroup (A.tumefaciens and Z.mobilis). 
Also, none of the rpoH genes examined contained promoter 
sequences similar to the σ32-consensus. On the other hand, a 
typical transcription terminator sequence, consisting of a hairpin 
structure followed by a T(U) cluster, was found shortly after the 
termination codon on mRNAs of all the rpoH homologs analyzed 
(not shown). 







Figure 6. Consensus phylogenetic trees of σ factors from gram negative 
bacteria were analyzed by the neighbor joining and parsimony methods. The 
entire sequences of region 2 were aligned with ICOT Free Software and used 
to obtain a consensus tree after 1000 bootstrap replications. Numbers given near 
the forks indicate calculated probabilities (per thousand) that the given species 
shown to the right should be grouped together, by means of a 1000 bootstrap 
analysis. The sequence data used included those used in Figures 1 and 2 plus 
M.xanthus SigB (MXA-B, X55500), M.xanthus SigC (MXA-C, L12992), 
E.coli FliA (ECO-F, P31804), P.aeruginosa FliA (PAE-F, S20544) and E.coli 
RpoE (ECO-E, P34086). Only the results from the neighbor joining analysis are 
shown; essentially identical results were obtained in the parsimony analysis. 



Phylogenetic relationships among RpoHs and other σ factors 

Finally, we analyzed phylogenetic relationships among RpoH 
homologs, as well as other σ factors of gram negative bacteria, 
including RpoD (σ70), RpoS (σS), RpoE (σE) and RpoF (σ28). 
RpoN (σ54) was excluded from this analysis, because it was 
known to be virtually unrelated to any of the other σ factors (see 
32). The generally conserved region 2.1-2.4 was compared using 
two independent methods, which yielded essentially identical 
results (Fig. 6). It seems evident that the RpoH homologs form a 
distinct cluster, separate from the rest of the σ factors, suggesting 
that the RpoH homologs originated from a common ancestor 
during evolution. The sigB and sigC gene products of Myxococcus 
xanthus were apparently most closely related phylogenetically 
to the RpoH homologs among the σ factors whose 
sequences are currently available in the databases. 



DISCUSSION 

The present analysis of RpoH homologs from the α and γ 
subgroups of the proteobacteria revealed that they form a distinct 
cluster in the phylogenetic tree among the σ factors of gram 
negative bacteria (Fig. 6). This is consistent with the observation 
that these homologs can closely mimic the function of σ32 when 
expressed in E.coli (K.Nakahigashi et al., unpublished), and 
suggests that the regulatory, as well as catalytic, functions of σ32 
are well conserved among these bacteria. The only other σ factors 
closely related to RpoH were SigB (MxaB) and SigC (MxaC) of 
M.xanthus, which belongs to the δ subgroup of the proteobacteria 
(see Fig. 6; 31). However, when the genes for these σ factors were 
prepared by PCR and introduced into the E.coli ΔrpoH mutant 
they failed to complement temperature-sensitive growth of the 
mutant, even at 30°C (K.Nakahigashi, unpublished). 

The segments of σ32 particularly well conserved among these 
homologs, but distinct from those of other σ factors, overlapped 
with region 2.4 and an adjacent region which contained the nine residues 
of the 'RpoH box' (Fig. 1). The high conservation of region 2.4 
specifically among the RpoH homologs was not unexpected, 
because this region is known to be primarily responsible for 
recognition of -10 promoter sequences (32). In agreement with 
this, all the RpoH homologs tested were capable of promoting 
synthesis of at least some of the major heat shock proteins when 
expressed in the E.coli ΔrpoH mutant (K.Nakahigashi et al., 
unpublished). 

Although the RpoH box located outside region 2.4 might also 
reflect a unique promoter specificity of this σ factor, it seems most 
likely that this region is involved in regulation characteristic of 
the RpoH homologs, because it was found to overlap with region 
C of E.coli σ32, supposed to be critical for DnaK/DnaJ-mediated 
negative control of its synthesis and degradation (11). It has been 
shown that strains carrying the rpoH-lacZ gene fusion with a 
deletion or frame-shift mutation affecting region C failed to shut 
off synthesis of fusion protein during the adaptation phase of the 
heat shock response (11,12). Moreover, the fusion proteins 
produced from these mutant derivatives are very stable, unlike the 
parental fusion protein, which is as unstable as authentic σ32 (11). 
Recently it was found that peptides of 13 amino acids that contain 
the RpoH box actually bind to DnaK with the highest affinities 
among a set of overlapping peptides spanning the entire σ32 
polypeptide (J.McCarty and B.Bukau, personal communication). 

We propose that the RpoH box, perhaps with its flanking 
sequence(s), is specifically required for chaperone-mediated 
negative control of the synthesis and/or degradation of RpoH 
proteins in a variety of gram negative bacteria. This would imply 
that the RpoH protein itself is involved in regulation of the heat 
shock response in all these bacteria, as had been documented in 
E.coli (6,12). This is interesting, but perhaps not surprising, 
because feedback control of the heat shock response is likely to be 
universal, involving transcription factors such as σ32 and eukaryotic 
heat shock factors (HSF) on the one hand and chaperones such 
as DnaK/DnaJ and HSP70 on the other, not only in prokaryotes but 
also in many eukaryotic organisms (1,12). 

Besides the RpoH box and region C discussed above, the 
5'-segment of mRNA immediately downstream of the initiation 
codon (downstream box; Fig. 3) and the mRNA secondary 
structure with its major stems (Fig. 4) are highly conserved among 
most (if not all) members of the γ proteobacteria, despite the wide 
variation in nucleotide composition and sequence among some of 
these homologs. These results strongly suggest that translational 
repression of RpoH proteins, as mediated by mRNA secondary 
structure during steady-state growth, and the rapid and transient 
activation following heat shock are a well-conserved mode of 
regulation among these bacteria. It is therefore likely that the 
regulatory mechanisms of the heat shock response in these bacteria 
are quite similar to what has been found in E.coli. A similar 
analysis with a putative rpoH homolog of Haemophilus influenzae 
Rd (33), an additional member of the γ subgroup, agreed well with 
this expectation (K.Nakahigashi et al., unpublished). 

In the α proteobacteria, including A.tumefaciens and Z.mobilis, 
transcription of a number of heat- or ethanol-inducible genes 
appears to be initiated from promoters similar to the 'σ32-consensus' 
found in E.coli. This includes virC of A.tumefaciens (34), 
dnaK of Caulobacter crescentus (35) and adhB of Z.mobilis (36). 
Besides, the E.coli dnaK heat shock gene was shown to be 
transcribed from known heat shock promoter(s) when expressed 
in A.tumefaciens, suggesting the involvement of a σ32-like factor 
in this bacterium (37). On the other hand, transcription of the groE 
(groES-groEL) operon in A.tumefaciens was reported to be 
initiated from a σ70-like promoter before or after heat shock (38). 
More direct experiments in vivo and in vitro would be required to 
resolve the apparent discrepancy between this and the other 
experiments cited above. 

It should also be noted that a well-conserved inverted repeat 
sequence similar to that found in gram positive bacteria is present 
in front of the groE operon in both of the α proteobacteria 
examined, A.tumefaciens and Z.mobilis (38,39). Since such 
inverted repeats have so far been detected only in the groE and 
dnaK operons in gram positive or negative bacteria, their roles in 
heat shock regulation may be restricted to genes encoding some 
of the major chaperones (15). The functional interplay of the 
positive (σ32) and negative elements (inverted repeats) in heat 
shock regulation of α proteobacteria would be an interesting 
subject for further study. Finally, extension of the present work to 
include additional members of the proteobacteria, as well as other 
distantly related bacteria, may provide useful information on the 
phylogenetic distribution of σ32 and other regulatory factors 
involved in the heat shock response in prokaryotes. 